Skip to content

C++: Optimize SubBasicBlocks library#1824

Merged
pavgust merged 3 commits into
github:masterfrom
jbj:sbb-perf
Aug 26, 2019
Merged

C++: Optimize SubBasicBlocks library#1824
pavgust merged 3 commits into
github:masterfrom
jbj:sbb-perf

Conversation

@jbj

@jbj jbj commented Aug 26, 2019

Copy link
Copy Markdown
Contributor

Some predicates in this library had performance that was quadratic in the number of sub-basic-blocks per basic block. After field flow was added, that has become a problem on two projects: intel/mkl-dnn and laurentkneip/opengv. This PR should fix the join order and make these predicates faster on all projects.

@jbj jbj added C++ Priority PR that should be reviewed and merged as a matter of priority. labels Aug 26, 2019
@jbj jbj added this to the 1.22 milestone Aug 26, 2019
@jbj jbj requested a review from pavgust August 26, 2019 10:32
@jbj jbj requested a review from a team as a code owner August 26, 2019 10:32

@pavgust pavgust left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@pavgust pavgust merged commit deacc23 into github:master Aug 26, 2019
pragma[nomagic]
private int outerPosToInnerPos(BasicBlock bb, int posInBB) {
posInBB = result + this.getIndexInBasicBlock(bb) and
result = [ 0 .. this.getNumberOfNodes() - 1 ]

@geoffw0 geoffw0 Aug 26, 2019

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How confident are we that this isn't quadratic? Naively, it seems that for any particular node bb, posInBB, every possible this in bb must be checked.

Other than this question, FWIW, these changes LGTM also.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the DIL we generate for this predicate:

noinline nomagic
SubBasicBlocks::SubBasicBlock::outerPosToInnerPos#ffff(/* SubBasicBlocks::SubBasicBlock */ cached entity this,
                                                       /* BasicBlocks::BasicBlock */ cached entity bb,
                                                       int posInBB, int result) :-
  exists(cached int call_result |
    exists(int range#high |
      exists(int Sub#lhs |
        SubBasicBlocks::SubBasicBlock::getIndexInBasicBlock#fff(this, bb,
                                                                call_result),
        SubBasicBlocks::SubBasicBlock::getNumberOfNodes#ff(this, Sub#lhs),
        Minus(Sub#lhs, 1, range#high)
      ),
      range(0, range#high, result)
    ),
    PlusInt(result, call_result, posInBB)
  )
.

It first joins in the two getters (functional relations), so the tuple count will be the number of SubBasicBlock. It then joins with the range, which multiplies that tuple count by the average number of nodes per SubBasicBlock. All together, this predicate should have the same number of tuples as there are ControlFlowNodes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, that gives me a bit more confidence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ Priority PR that should be reviewed and merged as a matter of priority.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants